Sains Malaysiana 55(2)(2026): 209-219
http://doi.org/10.17576/jsm-2026-5502-03
A Review of CNN-Based Typical Urban Land Cover Segmentation Techniques in Multispectral Remote Sensing Imagery
(Suatu Ulasan Teknik Segmentasi Litupan Tanah Bandar Tipikal Berasaskan CNN dalam Imej Penderiaan Jauh Multispektral)
ZHAO HAIMENG1,2, RAIHANI MOHAMED1,* & NG SENG BENG1
1Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia
2College of Artificial Intelligence, Guilin University of Aerospace Technology, Guilin, Guangxi, 541004, China
Received: 20 June 2025/Accepted: 28 January 2026
Abstract
Compared with visible-light remote sensing, multispectral remote sensing provides
multi-band land surface information and enhances spectral separability through
data fusion, thereby enabling more accurate surface representation. However,
spectral redundancy, resolution discrepancies, and highly complex urban
environments impose greater challenges on existing methods. Deep learning
approaches based on convolutional neural networks (CNNs) offer superior
capabilities in extracting and integrating multispectral features, enabling
more accurate urban land cover segmentation. This review focuses on pixel-level
urban land cover segmentation and systematically summarizes recent advances in
deep learning for multispectral remote sensing. First, we emphasize that the rich
spectral information and spatial complementarity of multispectral data
effectively enhance segmentation performance and alleviate ambiguities caused
by the ‘same spectrum, different objects’ and ‘same object, different spectra’ phenomena.
Second, we review 19 publicly available multispectral datasets, highlighting
differences in spectral bands, spatial resolution, and application scenarios,
and summarize a standardized preprocessing pipeline including radiometric
calibration, geometric correction, band normalization, and spectral
dimensionality reduction to support reproducibility. Third, we discuss
representative spectral-spatial feature extraction and cross-scale context
modeling strategies, covering dilated convolution, 3D-2D hybrid structures,
dual-branch architectures, and multi-scale enhancement modules. Extensive
comparative experiments on the ISPRS Potsdam and GID datasets further demonstrate
the applicability and performance differences of representative models.
Finally, future research trends and directions are discussed, encompassing
multi-temporal and multi-scale temporal learning, cross-modal fusion, and the
lightweight design of complex models.
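Two of the preprocessing steps named in the abstract, band normalization and spectral dimensionality reduction, can be illustrated with a minimal NumPy sketch. It is not the pipeline used in the review: the function names `normalize_bands` and `reduce_bands_pca`, the choice of per-band z-score scaling, the use of PCA as the reduction method, and the synthetic 8-band tile are all assumptions made for illustration. Radiometric calibration and geometric correction are sensor-specific and are left out.

```python
import numpy as np

def normalize_bands(image):
    """Scale each band of an (H, W, B) array to zero mean and unit variance."""
    image = image.astype(np.float64)
    mean = image.mean(axis=(0, 1), keepdims=True)
    std = image.std(axis=(0, 1), keepdims=True)
    # Guard against constant bands, whose standard deviation is zero
    return (image - mean) / np.where(std > 0, std, 1.0)

def reduce_bands_pca(image, n_components):
    """Project the B bands of an (H, W, B) array onto the top principal components."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b)
    pixels = pixels - pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)            # (B, B) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return (pixels @ eigvecs[:, order]).reshape(h, w, n_components)

# Synthetic 8-band tile standing in for a real multispectral scene (assumption)
rng = np.random.default_rng(0)
scene = rng.uniform(0, 10000, size=(64, 64, 8))

normalized = normalize_bands(scene)
reduced = reduce_bands_pca(normalized, n_components=3)
print(normalized.shape, reduced.shape)
```

Normalizing before the projection keeps bands with large digital-number ranges from dominating the principal components, which is one reason the two steps are usually applied in this order.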
Keywords: Convolutional neural network (CNN); multispectral features; remote sensing data; semantic segmentation; surface feature extraction
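Dilated convolution, listed in the abstract among the cross-scale context modeling strategies, enlarges the receptive field without adding weights. The pure-Python sketch below is only an illustration of that idea on a single band: the function name `dilated_conv2d`, the 7 x 7 test image, and the all-ones kernel are hypothetical and do not come from any model discussed in the review.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation with a dilated kernel, on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    # Extent covered by the kernel once (dilation - 1) gaps separate its taps
    span_h = (kh - 1) * dilation + 1
    span_w = (kw - 1) * dilation + 1
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Sample the input every `dilation` pixels
                    total += kernel[u][v] * image[i + u * dilation][j + v * dilation]
            row.append(total)
        output.append(row)
    return output

image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
kernel = [[1.0] * 3 for _ in range(3)]

dense = dilated_conv2d(image, kernel, dilation=1)   # each output sees a 3 x 3 window
sparse = dilated_conv2d(image, kernel, dilation=2)  # same 9 weights, 5 x 5 window
print(len(dense), len(sparse))
```

With a dilation of 2 the same nine weights cover a 5 x 5 neighbourhood, which is how stacked dilated layers gather wider context at constant parameter cost.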
Abstrak
Dibandingkan dengan penderiaan jauh cahaya tampak, penderiaan jauh multispektral menyediakan maklumat permukaan tanah pelbagai jalur dan meningkatkan kebolehbezaan spektral melalui penggabungan data, seterusnya membolehkan perwakilan permukaan yang lebih tepat. Walau bagaimanapun, pertindihan spektral, perbezaan resolusi dan persekitaran bandar yang sangat kompleks menimbulkan cabaran lebih besar terhadap kaedah sedia ada. Pendekatan pembelajaran mendalam berasaskan rangkaian neural konvolusi (CNN) menawarkan keupayaan unggul dalam mengekstrak dan mengintegrasikan ciri multispektral, membolehkan pengasingan litupan tanah bandar yang lebih tepat. Ulasan ini memberi tumpuan pada pengasingan liputan tanah bandar per tahap piksel dan secara sistematik merumuskan kemajuan terkini dalam pembelajaran mendalam untuk penderiaan jauh multispektral. Pertama, kami menekankan bahawa maklumat spektral yang kaya dan pelengkap reruang data multispektral berkesan meningkatkan prestasi pengasingan dan mengurangkan kekeliruan akibat ‘spektrum sama-objek berbeza’ dan ‘objek sama-spektrum berbeza’. Kedua, kami mengkaji 19 set data multispektral yang tersedia secara awam, menyoroti perbezaan dalam jalur spektral, resolusi spasial dan senario aplikasi, serta merumuskan saluran prapemprosesan piawai termasuk kalibrasi radiometrik, pembetulan geometri, normalisasi jalur dan pengurangan dimensi spektral untuk menyokong kebolehulangan. Ketiga, kami membincangkan strategi pengekstrakan ciri spektral-reruang dan permodelan konteks silang-skala, merangkumi konvolusi dilasi, struktur hibrid 3D-2D, seni bina dwi-cabang dan modul peningkatan multi-skala. Uji kaji perbandingan luas pada dataset ISPRS Potsdam dan GID seterusnya menunjukkan keberkesanan dan perbezaan prestasi model wakil. Akhirnya, tren dan arah penyelidikan masa depan dibincangkan, termasuk pembelajaran temporal berbilang-skala dan berbilang-masa, penggabungan lintas-mod serta reka bentuk ringan bagi model kompleks.
Kata kunci: Ciri multispektral; data penderiaan jauh; pengekstrakan ciri permukaan; pengelasan semantik; rangkaian neural konvolusi (CNN)
*Corresponding author; email: raihanimohamed@upm.edu.my